# Low-resource inference

Diffucoder 7B Cpgrpo 4bit
DiffuCoder-7B-cpGRPO-4bit is a 4-bit quantized version converted from the Apple DiffuCoder-7B-cpGRPO model, optimized for the MLX framework.
Large Language Model Other
mlx-community
218
1
Kimi Dev 72B GGUF
MIT
A quantized version of Kimi-Dev-72B, using advanced nonlinear optimal quantization and multi-head latent attention mechanism to reduce storage and computing requirements.
Large Language Model Other
ubergarm
2,780
1
Delta Vector Austral 24B Winton GGUF
Apache-2.0
A quantized version of the Austral-24B-Winton model of Delta-Vector, quantized using the llama.cpp tool, suitable for efficient operation on different hardware configurations.
Large Language Model English
bartowski
421
1
Qwen3 235B A22B 4bit DWQ 053125
Apache-2.0
This is a 4-bit quantized version converted from the Qwen3-235B-A22B-8bit model, optimized for the MLX framework and suitable for text generation tasks.
Large Language Model
mlx-community
200
1
Phantom Wan 1.3B GGUF
Apache-2.0
This is a project that directly converts bytedance-research/Phantom to the GGUF format for image-to-video conversion tasks.
Text-to-Video English
QuantStack
315
3
Deepseek R1 0528 Qwen3 8B MLX 8bit
MIT
An 8-bit quantized version based on the DeepSeek-R1-0528-Qwen3-8B model, optimized for Apple Silicon chips and suitable for text generation tasks.
Large Language Model
lmstudio-community
151.87k
2
Llama 3.3 70b Instruct Deepseek Distilled GGUF
Apache-2.0
A multilingual text generation model fine-tuned based on unsloth/Llama-3.3-70B-Instruct-bnb-4bit, supporting English, Spanish, Latin, Arabic, and French.
Large Language Model Transformers Supports Multiple Languages
ykarout
202
1
Dans PersonalityEngine V1.3.0 24b Q4 K M GGUF
Apache-2.0
A multilingual text generation model based on Mistral-Small-3.1-24B-Base-2503 that supports 10 languages and is suited to role-playing and dialogue scenarios.
Large Language Model Transformers
King-Cane
596
1
Gemma 3 1b It Fast GUFF
A quantized version of Gemma 3 1B IT optimized for low-end hardware and CPU-only environments, intended for production-ready inference under resource constraints.
Large Language Model
h4shy
101
1
Bielik 4.5B V3.0 Instruct GGUF
Apache-2.0
Bielik-4.5B-v3.0-Instruct-GGUF is a Polish large language model released by SpeakLeash, converted from Bielik-4.5B-v3.0-Instruct to GGUF quantized format, suitable for local inference.
Large Language Model Other
speakleash
693
4
Nousresearch DeepHermes 3 Llama 3 3B Preview GGUF
An instruction fine-tuned model based on the Llama-3-3B architecture, supporting tasks such as dialogue, reasoning, and role-playing, suitable for general artificial intelligence assistance scenarios.
Large Language Model English
bartowski
1,033
3
Llama 3 8B Instruct Abliterated TR
An abliterated version of Llama-3-8B-Instruct, modified so that the model responds in Turkish.
Large Language Model Transformers Other
Metin
25
5
Zero Mistral 24B Gguf
MIT
Zero-Mistral-24B is a large language model based on the Mistral architecture, supporting Russian and English, suitable for dialogue and text generation tasks.
Large Language Model Supports Multiple Languages
ZeroAgency
613
3
Orpheus 3b Kaya Q8 0.gguf
Apache-2.0
An 8-bit quantized text-to-speech model fine-tuned from Canopy Labs' pre-trained model, supporting 24kHz English audio generation
Speech Synthesis Supports Multiple Languages
lex-au
38
0
Google Gemma 3 27b It Qat GGUF
A quantized version based on Google Gemma 3's 27-billion parameter instruction-tuned model, generated using quantization-aware training (QAT) weights, supporting multiple quantization levels to meet different hardware requirements.
Large Language Model
bartowski
14.97k
31
Gemma 3 12b It GPTQ 4b 128g
This model is an INT4 quantized version of google/gemma-3-12b-it. It uses the GPTQ algorithm to reduce weight precision from 16-bit to 4-bit, significantly decreasing disk space and GPU memory requirements.
Image-to-Text Transformers
ISTA-DASLab
1,175
2
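The 16-bit to 4-bit reduction mentioned in the Gemma 3 12b It GPTQ entry can be approximated with simple arithmetic. The sketch below is illustrative only: the function name `weight_storage_gib` is hypothetical, the 12-billion parameter count is assumed from the model name, and the estimate ignores quantization metadata (group scales and zero points), any layers left unquantized, and runtime memory such as the KV cache.

```python
def weight_storage_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate storage for the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Assumption: roughly 12 billion parameters, taken from the model name.
PARAMS_12B = 12e9

fp16_gib = weight_storage_gib(PARAMS_12B, 16)
int4_gib = weight_storage_gib(PARAMS_12B, 4)

print(f"16-bit weights: {fp16_gib:.1f} GiB")
print(f" 4-bit weights: {int4_gib:.1f} GiB")
print(f"reduction     : {fp16_gib / int4_gib:.0f}x")
```

Real GPTQ checkpoints are usually somewhat larger than this lower bound because each group of weights (128 per group in a `128g` configuration) also stores its own scale and zero point.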
3b Hi Ft Research Release Q4 K M GGUF
Apache-2.0
This is a GGUF format model converted from the canopylabs/3b-hi-ft-research_release model, supporting Hindi text processing.
Large Language Model Other
freddyaboulton
123
0
Turkish Llama 3 8B Function Calling GGUF
Apache-2.0
This is a Turkish function calling model fine-tuned based on the Turkish-Llama-8b-DPO-v0.1 model, specifically designed for executing Turkish function calling tasks.
Large Language Model Transformers Supports Multiple Languages
oncu
103
1
Huihui Ai Gemma 3 1b It Abliterated GGUF
This is a quantized version of the abliterated Google Gemma 3 1B instruction-tuned model, produced with llama.cpp and suitable for running in resource-limited environments.
Large Language Model
bartowski
3,123
3
Qwq 32B NF4
Apache-2.0
This is the 4-bit quantized version of the Qwen/QwQ-32B model, optimized using the BitsAndBytes library, suitable for text generation tasks in resource-constrained environments.
Large Language Model Transformers English
ginipick
150
27
Rwkv7 0.1B G1
Apache-2.0
An RWKV-7 g1 model based on the Flash Linear Attention mechanism, supporting multilingual processing and deep-thinking (reasoning) capabilities.
Large Language Model Transformers Supports Multiple Languages
fla-hub
377
5
MS3 RP Broth 24B
Apache-2.0
An intermediate model during the Tantum merging process, created by merging multiple 24B-parameter Mistral and Llama3 variants, suitable for role-playing and text generation tasks.
Large Language Model Transformers English
d-rang-d
337
6
Thor V2.5 8b FANTASY FICTION 128K Q4 K M GGUF
This is a GGUF-format converted 8B-parameter language model specialized for fantasy fiction, supporting 128K context length.
Large Language Model English
MrRobotoAI
22
0
Llasa 1B Q8 0 GGUF
This model is converted from HKUST-Audio/Llasa-1B into GGUF format, primarily designed for text-to-speech tasks.
Speech Synthesis Supports Multiple Languages
NikolayKozloff
16
3
SAINEMO Remix
A hybrid model based on multiple 12B parameter models, specializing in Russian and English role-playing and text generation
Large Language Model Transformers
Moraliane
201
36
Llama3 8B 1.58 100B Tokens
A large language model fine-tuned to the BitNet 1.58b architecture from the Llama-3-8B-Instruct base model, using extreme quantization techniques.
Large Language Model Transformers
HF1BitLLM
2,427
181
Bielik 11B V2.3 Instruct GGUF
Apache-2.0
This is the GGUF quantized version of the Polish large language model Bielik-11B-v2.3-Instruct developed by SpeakLeash, suitable for local deployment and use.
Large Language Model Transformers
speakleash
2,203
29
Phi 3 Mini 4k Instruct Q4 K M GGUF
MIT
This model was converted from microsoft/Phi-3-mini-4k-instruct to GGUF format using llama.cpp via ggml.ai's GGUF-my-repo space.
Large Language Model Supports Multiple Languages
matrixportal
67
3
Llama 3.1 Storm 8B
Llama-3.1-Storm-8B is a model built on Llama-3.1-8B-Instruct, aiming to improve the dialogue and function-calling capabilities of 8-billion-parameter models.
Large Language Model Transformers Supports Multiple Languages
akjindal53244
22.93k
176
Cere Llama 3.1 8B Tr
A fine-tuned version of the Llama3.1 8B large language model optimized for Turkish, trained on high-quality Turkish instruction datasets
Large Language Model Transformers Other
CerebrumTech
41
3
Gemma 2 27b It Q8 0 GGUF
This is a GGUF-format model converted from Google's Gemma 2 27B instruction-tuned model, suitable for text generation tasks.
Large Language Model
KimChen
471
2
Bitnet B1 58 Xl Q8 0 Gguf
MIT
BitNet b1.58 is a large language model with 1.58-bit quantization. It reduces the computational resource requirements by lowering the weight precision while maintaining performance close to that of a full-precision model.
Large Language Model Transformers
BoscoTheDog
326
7
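The "1.58-bit" figure in the BitNet b1.58 entry is commonly explained as the information content of a ternary weight, one restricted to the values -1, 0, and +1: log2(3) is about 1.58 bits. The sketch below checks that arithmetic and shows a simple absmean-style ternary rounding. The function `ternarize` and the sample weights are hypothetical illustrations, not code or data from the listed model.

```python
import math

# A weight with three possible values carries log2(3) bits of information.
bits_per_ternary_weight = math.log2(3)
print(f"bits per ternary weight: {bits_per_ternary_weight:.3f}")


def ternarize(weights):
    """Round weights to {-1, 0, +1} after scaling by their mean absolute value.

    Returns the ternary values and the scale needed to approximate the
    original weights as value * scale.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0:
        return [0] * len(weights), 0.0
    # Round to the nearest integer, then clip into the ternary range.
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


sample = [0.9, -0.05, -1.2, 0.3]  # hypothetical weights
ternary, scale = ternarize(sample)
print(ternary, scale)
```

Storing only a ternary value per weight plus one scale per tensor is what makes the footprint so small; how well accuracy holds up depends on training the model with this constraint rather than rounding a finished full-precision model.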
Cere Llama 3 8b Tr
A fine-tuned version of the Llama3 8b large language model optimized for Turkish, trained on high-quality Turkish instruction datasets
Large Language Model Transformers Other
CerebrumTech
2,959
7
Llama 3 8b Ita
An Italian large language model optimized based on Meta-Llama-3-8B, supporting English and Italian text generation tasks
Large Language Model Transformers Supports Multiple Languages
DeepMount00
16.00k
27
Llama 3 8B Instruct GPTQ 4 Bit
Other
This is a 4-bit quantized GPTQ model based on Meta Llama 3, quantized by Astronomer, capable of efficient operation on low-VRAM devices.
Large Language Model Transformers
astronomer
2,059
25
Distil Whisper Small Cantonese
Apache-2.0
This is a distilled Cantonese speech recognition model based on Whisper Small, achieving a CER of 9.7 (without punctuation) on Common Voice 16.0.
Speech Recognition Transformers Chinese
alvanlii
187
7
Indic Gemma 2b Finetuned Sft Navarasa 2.0
Other
Multilingual instruction model fine-tuned on Gemma-2b, supporting 15 Indian languages and English
Large Language Model Transformers Supports Multiple Languages
Telugu-LLM-Labs
166
24
Yugo55a 4bit
MIT
Yugo55A-GPT is a Serbian-optimized large language model merged from multiple excellent models, demonstrating outstanding performance in Serbian LLM evaluations.
Large Language Model Transformers Other
datatab
47
1
Minueza 32M Base
Apache-2.0
Minueza-32M-Base is a base model with 32 million parameters, fully trained on extensive English text corpora, suitable for text generation tasks.
Large Language Model Transformers English
Felladrin
68
18
Law LLM 13B GGUF
Other
Law LLM 13B is a domain-specific foundation model built on LLaMA-1-13B, focusing on tasks in the legal domain.
Large Language Model Transformers English
TheBloke
420
8